pacman::p_load(tmap, sf, DT, stplanr, tidyverse)Spatial Interactions
1.1 Overview
Spatial interaction represent the dynamic flow or movementmof people, material, or information between locations in geographical space. Each spatial interaction is composed of a discrete origin and destination (OD) pair and represented as a spatial interaction matrix of centroids from origin and destination. The connection between origin and their destination can be visualized by desire lines, typically straight lines that records the the number of people travelling between locations.
Spatial Interaction Models are mathematical models for estimating flows between spatial entities.
There are four main types of spatial interaction models: 1. Unconstrained 2. Production-constrained 3. Attraction-constrained 4. Doubly-constrained
Learning Objectives 1. Build a spatial interaction matrix 2. Construct desire lines in geospatial data 3. Compute a distance matrix
1.2 Load packages
sf performs geospatial data import, integration, processing and transformation. DT enables R data objects (matrices or data frames) to be displayed as tables on HTML pages. tidyverse performs data import, integration, wrangling and visualisation. tmap creates thematic maps. stplanranalyses OD matrix.
1.3 Import data
odbus is an aspatial dataset containing the number of trips by weekdays and weekends from origin to destination bus stops. It reflects the passenger trip traffic and the most recent dataset from September 2023 will be used.
The output indicates 5,714,196 records and 7 fields. The bus stop codes are converted into factor for data handling.
Source: LTA DataMall (Postman URL)
odbus <- read_csv("data/aspatial/origin_destination_bus_202309.csv")
glimpse(odbus)Rows: 5,714,196
Columns: 7
$ YEAR_MONTH <chr> "2023-09", "2023-09", "2023-09", "2023-09", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKENDS/HOLIDAY", "WEEKDAY",…
$ TIME_PER_HOUR <dbl> 17, 10, 10, 7, 7, 11, 16, 16, 16, 20, 7, 11, 11, 1…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <chr> "24499", "65239", "65239", "23519", "23519", "5250…
$ DESTINATION_PT_CODE <chr> "22221", "65159", "65159", "23311", "23311", "4204…
$ TOTAL_TRIPS <dbl> 1, 9, 2, 6, 1, 2, 18, 3, 2, 1, 2, 5, 3, 5, 5, 19, …
Origin and destination codes will be converted to from character to factor.
odbus <- odbus %>%
mutate(ORIGIN_PT_CODE = as.factor(ORIGIN_PT_CODE),
DESTINATION_PT_CODE = as.factor(DESTINATION_PT_CODE))busstops is a geospatial dataset that contains the detailed information for all bus stops currently serviced by buses, including bus stop code, road name, description, location coordinates. The output indicates that the geospatial objects are point features. There are 5161 features and 3 fields. It is in SVY21 projected coordinates system with XY dimension.
busstop <- st_read(dsn = "data/geospatial/BusStopLocation_Jul2023", layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Hands_on_Ex/Hands_on_Ex03/data/geospatial/BusStopLocation_Jul2023'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
glimpse(busstop)Rows: 5,161
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
mpsz is a geospatial dataset from the Master Plan 2019, a forward looking guiding plan for Singapore’s development in the medium term over the next 10 to 15 years published in 2019. Note this mpsz differs from that in previous chapter, Data Wrangling.
The output indicates that the geospatial objects are multipolygon features. There are 332 features and 6 fields. It is in WGS84 projected coordinates system with XY dimension.
mpsz = st_read(dsn="data/geospatial/MPSZ-2019", layer="MPSZ-2019") %>%
st_transform(crs=3414)Reading layer `MPSZ-2019' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Hands_on_Ex/Hands_on_Ex03/data/geospatial/MPSZ-2019'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
glimpse(mpsz)Rows: 332
Columns: 7
$ SUBZONE_N <chr> "MARINA EAST", "INSTITUTION HILL", "ROBERTSON QUAY", "JURON…
$ SUBZONE_C <chr> "MESZ01", "RVSZ05", "SRSZ01", "WISZ01", "MUSZ02", "MPSZ05",…
$ PLN_AREA_N <chr> "MARINA EAST", "RIVER VALLEY", "SINGAPORE RIVER", "WESTERN …
$ PLN_AREA_C <chr> "ME", "RV", "SR", "WI", "MU", "MP", "WI", "WI", "SI", "SI",…
$ REGION_N <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "WEST…
$ REGION_C <chr> "CR", "CR", "CR", "WR", "CR", "CR", "WR", "WR", "CR", "CR",…
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((33222.98 29..., MULTIPOLYGON (…
write_rds() write sf tibble data frame into rds file.
write_rds(mpsz, "rds/mpsz.rds")1.4 Prepare data
We will focus on weekday afternoon peak hours to capture high movements and interactions.
Commuting flows on weekday between 6am to 9am are extracted and there are 226,658 records during weekday afternoon peak hours based on bus stop code. The data table can be viewed using datatable() and saved to rds using write_rds().
odbus69 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
datatable(odbus69)RDS files have two main advantages over CSV files: 1. RDS takes up less disk space. 2. RDS format handles different types of objects, such as fitted models or statistical test results, while CSVs can only save rectangular data, such as data frames.
write_rds exports output in rds format.
write_rds(odbus69, "rds/odbus69.rds")read_rds imports rds files.
odbus69 <- read_rds("rds/odbus69.rds")To integrate the planning subzone codes SUBZONE_C of mpsz sf data frame into busstop sf data frame, st_intersection() is used to perform point and polygon overlay and the output will be in point sf object. select() retain only BUS_STOP_N and SUBZONE_C in the busstop_mpsz sf data frame. There are 5,156 bus stops in the subzones. st_drop_geometryremove the geometric information from the output and return the aspatial data.
busstop_mpsz <- st_intersection(busstop, mpsz) %>%
select(BUS_STOP_N, SUBZONE_C) %>%
st_drop_geometry()
datatable(busstop_mpsz)Based on origin bus stop codes, append the planning subzone code from busstop_mpsz data frame onto od_data data frame using left_join(). There are 239,219 commuting flows records consisting of 5,020 origin bus stops.
od_data <- left_join(odbus69, busstop_mpsz,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_SZ = SUBZONE_C,
DESTIN_BS = DESTINATION_PT_CODE)
glimpse(od_data)Rows: 239,219
Columns: 4
Groups: ORIGIN_BS [5,020]
$ ORIGIN_BS <chr> "01012", "01012", "01012", "01012", "01012", "01012", "01012…
$ DESTIN_BS <fct> 01112, 01113, 01121, 01211, 01311, 01621, 01639, 07371, 6001…
$ TRIPS <dbl> 208, 112, 63, 118, 195, 3, 1, 10, 30, 18, 33, 19, 1, 7, 1, 1…
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
od_data is a large dataset and it is possible to have duplicate records. The output shows 1,154 duplicated rows.
duplicate <- od_data %>%
group_by_all() %>%
filter(n() > 1) %>%
ungroup()
dim(duplicate)[1] 1154 4
Duplicates are removed by retaining unique() records. From previous 239,219 records, 577 rows containing duplication are removed and 238,642 unique records remain in od_data.
od_data <- unique(od_data)
glimpse(od_data)Rows: 238,642
Columns: 4
Groups: ORIGIN_BS [5,020]
$ ORIGIN_BS <chr> "01012", "01012", "01012", "01012", "01012", "01012", "01012…
$ DESTIN_BS <fct> 01112, 01113, 01121, 01211, 01311, 01621, 01639, 07371, 6001…
$ TRIPS <dbl> 208, 112, 63, 118, 195, 3, 1, 10, 30, 18, 33, 19, 1, 7, 1, 1…
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
Based on destination bus stop codes, append the planning subzone code from busstop_mpsz data frame onto od_data data frame using left_join(). Summing the trips by origin and destination, there are 21,112 OD flows during weekday morning peak hour.
od_data <- left_join(od_data , busstop_mpsz,
by = c("DESTIN_BS" = "BUS_STOP_N"))
od_data <- od_data %>%
rename(DESTIN_SZ = SUBZONE_C) %>%
drop_na() %>%
group_by(ORIGIN_SZ, DESTIN_SZ) %>%
summarise(MORNING_PEAK = sum(TRIPS)) %>%
ungroup()
glimpse(od_data)Rows: 21,112
Columns: 3
$ ORIGIN_SZ <chr> "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01…
$ DESTIN_SZ <chr> "AMSZ01", "AMSZ02", "AMSZ03", "AMSZ04", "AMSZ05", "AMSZ06…
$ MORNING_PEAK <dbl> 2287, 10057, 13211, 2836, 6913, 2086, 1558, 2451, 2047, 1…
Save od_data to RDS.
write_rds(od_data, "rds/od_data.rds")od_data <- read_rds("rds/od_data.rds")Step 6
To ensure that commuting flows are between different subzones, the intra-zonal flows within the same subzones will be removed.
od_data1 <- od_data[od_data$ORIGIN_SZ!=od_data$DESTIN_SZ,]1.5 Plot data
Create desire lines
od2line() takes a data frame and a spatial object as inputs and outputs geographic lines, also known as desire lines, to represent movement between origins and destinations.
flowline <- od2line(flow = od_data1,
zones = mpsz,
zone_code = "SUBZONE_C")tmap_options(check.and.fix = TRUE)
tm_shape(mpsz) +
tm_polygons() +
flowline %>%
tm_shape() +
tm_lines(lwd = "MORNING_PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 1)
To visualize OD flows above 10000,
tm_shape(mpsz) + tm_polygons() + flowline %>%
filter(MORNING_PEAK >= 10000) %>%
tm_shape() + tm_lines(lwd = "MORNING_PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 1)